1 About

1.1 Contributions

Please note that authorship is alphabetical. Contributions are listed below; see GitHub for details and who to blame for what :-).

1.3 Citation

If you wish to refer to any of the material from this report please cite as:

  • Anderson, B., (2019) : , University of Southampton: Southampton, UK.

Report circulation:

  • Public

Report purpose:

This work is (c) 2019 the University of Southampton.

3 Load data

Load previously processed data…

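# Root folder for the air quality data
dataPath <- path.expand("~/Data/SCC/airQual/")

# List the processed files; `pattern` is a regular expression, not a glob
files <- list.files(file.path(dataPath, "processed"), pattern = "\\.gz$", full.names = TRUE)

# Read each file and stack them; fill = TRUE pads columns missing from some files with NA
l <- lapply(files, data.table::fread)
dt <- data.table::rbindlist(l, fill = TRUE)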

skimr::skim(dt)
## Skim summary statistics
##  n obs: 104844 
##  n variables: 10 
## 
## ── Variable type:character ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────
##            variable missing complete      n min max empty n_unique
##  MeasurementDateGMT       0   104844 104844  16  16     0    17474
##                site       0   104844 104844  22  31     0        6
## 
## ── Variable type:logical ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
##  variable missing complete      n mean  count
##        co  104844        0 104844  NaN 104844
## 
## ── Variable type:numeric ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
##  variable missing complete      n  mean    sd   p0  p25  p50  p75  p100     hist
##       nox   44412    60432 104844 22.37 33.09 -5    4.2 11.5 27.3 492.3 ▇▁▁▁▁▁▁▁
##      nox2   44417    60427 104844 34.16 20.03  0   19.1 30.9 45.5 174.7 ▆▇▃▁▁▁▁▁
##     noxes   68605    36239 104844 71.05 64.01  1.5 32.3 54.9 87.8 904.8 ▇▁▁▁▁▁▁▁
##        oz   88642    16202 104844 40.86 23.99 -0.2 23.7 41.3 56.5 174.1 ▆▇▇▂▁▁▁▁
##      pm10   74617    30227 104844 18.02 10.97 -1.5 11   15.5 22.2 252.5 ▇▁▁▁▁▁▁▁
##     pm2_5   89429    15415 104844 12.98  9.41 -4    7.3 10.3 15.6 239.1 ▇▁▁▁▁▁▁▁
##       so2   89058    15786 104844  4.01  3.87 -1.1  1.2  2.4  5.8  49.6 ▇▂▁▁▁▁▁▁
# Parse the timestamp strings into date-times
dt[, obsDateTime := lubridate::ymd_hm(MeasurementDateGMT)]

# Mean of each measure per site (NaN where a site has no observations)
t <- dt[, .(
    `co: Carbon Monoxide, mg/m3` = mean(co, na.rm = TRUE),
    `nox = Nitric Oxide, ug/m3` = mean(nox, na.rm = TRUE),
    `nox2 = Nitrogen Dioxide, ug/m3` = mean(nox2, na.rm = TRUE),
    `noxes = Oxides of Nitrogen, ug/m3` = mean(noxes, na.rm = TRUE),
    `oz = ozone, ug/m3` = mean(oz, na.rm = TRUE),
    `pm10, ug/m3` = mean(pm10, na.rm = TRUE),
    `pm2_5, ug/m3` = mean(pm2_5, na.rm = TRUE),
    `so2 = Sulphur Dioxide, ug/m3` = mean(so2, na.rm = TRUE)
), keyby = .(site)]

kableExtra::kable(t, caption = "Mean values per site (NaN indicates not measured)") %>% 
    kable_styling()
Table 3.1: Mean values per site (NaN indicates not measured)
site | co: Carbon Monoxide, mg/m3 | nox = Nitric Oxide, ug/m3 | nox2 = Nitrogen Dioxide, ug/m3 | noxes = Oxides of Nitrogen, ug/m3 | oz = ozone, ug/m3 | pm10, ug/m3 | pm2_5, ug/m3 | so2 = Sulphur Dioxide, ug/m3
Southampton - A33 Roadside AURN | NaN | 26.49725 | 32.27554 | NaN | NaN | 16.89069 | NaN | NaN
Southampton - Bitterne | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
Southampton - Onslow Road | NaN | 25.89773 | 39.91412 | 79.62518 | NaN | NaN | NaN | NaN
Southampton - Redbridge | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
Southampton - Victoria Road | NaN | 25.23369 | 36.80428 | 75.49788 | NaN | NaN | NaN | NaN
Southampton Background AURN | NaN | 12.60205 | 28.37414 | 47.85700 | 40.86236 | 19.30788 | 12.97722 | 4.009033

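Table 3.1 gives an indication of the availability of the different measures: NaN marks a measure with no valid observations at that site. Carbon monoxide is not available at any site, Bitterne and Redbridge have no data for any measure, and only Southampton Background AURN reports ozone, PM2.5 and sulphur dioxide.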
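Availability can also be checked directly by counting non-missing observations rather than inferring it from NaN means. The following is a minimal sketch on made-up data; the `toy` table, its site names and its values are hypothetical stand-ins for `dt`, not the report's data.

```r
library(data.table)

# Toy data standing in for the report's `dt` (all values are made up)
toy <- data.table(
  site = c("A", "A", "B", "B", "C", "C"),
  nox  = c(1.2, NA, 3.4, 5.6, NA, NA),
  pm10 = c(10, 12, NA, NA, NA, NA)
)

measures <- c("nox", "pm10")

# Count the non-missing observations of each measure at each site
avail <- toy[, lapply(.SD, function(x) sum(!is.na(x))),
             keyby = site, .SDcols = measures]
print(avail)
```

A count of zero then plays the same role as NaN in the table of means, but also shows how much data each site contributes.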

4 Nitric Oxide

t <- dt[, .(mean = mean(nox, na.rm = TRUE), sd = sd(nox, na.rm = TRUE), min = min(nox, 
    na.rm = TRUE), max = max(nox, na.rm = TRUE)), keyby = .(site)]
kableExtra::kable(t, caption = "Summary of nox data") %>% kable_styling()
Table 4.1: Summary of nox data
site                            | mean     | sd       | min  | max
Southampton - A33 Roadside AURN | 26.49725 | 36.26582 | 0.0  | 404.9
Southampton - Bitterne          | NaN      | NA       | Inf  | -Inf
Southampton - Onslow Road       | 25.89773 | 33.59185 | -2.9 | 444.1
Southampton - Redbridge         | NaN      | NA       | Inf  | -Inf
Southampton - Victoria Road     | 25.23369 | 34.93944 | -5.0 | 492.3
Southampton Background AURN     | 12.60205 | 24.85957 | 0.1  | 395.3
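The `NaN`, `NA`, `Inf` and `-Inf` entries for sites without data are what base R returns when every value in a group is missing and `na.rm = TRUE` is set. The sketch below reproduces that behaviour and shows one possible way to return `NA` instead; `safe_min()` and `safe_max()` are hypothetical helpers, not functions used in the report.

```r
# A group in which every observation is missing
x <- c(NA_real_, NA_real_)

# With na.rm = TRUE nothing is left to summarise
m_mean <- mean(x, na.rm = TRUE)                   # NaN
m_sd   <- sd(x, na.rm = TRUE)                     # NA
m_min  <- suppressWarnings(min(x, na.rm = TRUE))  # Inf (plus a warning)
m_max  <- suppressWarnings(max(x, na.rm = TRUE))  # -Inf (plus a warning)

# Hypothetical helpers that report NA for an all-missing group
safe_min <- function(x) if (all(is.na(x))) NA_real_ else min(x, na.rm = TRUE)
safe_max <- function(x) if (all(is.na(x))) NA_real_ else max(x, na.rm = TRUE)

print(c(safe_min(x), safe_max(x)))
print(c(safe_min(c(3, NA, -1)), safe_max(c(3, NA, -1))))
```

Using helpers like these in the summary call would avoid the warnings and make "no data" read the same way in every column.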

Table 4.1 shows negative minimum values at two sites; in total 150 nox observations are negative. The first 10 are listed in Table 4.2, followed by a count by site.

t <- head(dt[nox < 0], 10)
kableExtra::kable(t, caption = "Negative nox values (first 10)") %>% kable_styling()
Table 4.2: Negative nox values (first 10)
MeasurementDateGMT | nox  | nox2 | noxes | pm10 | site                      | co | oz | pm2_5 | so2 | obsDateTime
2018-01-29 03:00   | -0.4 | 16.7 | 16.1  | NA   | Southampton - Onslow Road | NA | NA | NA    | NA  | 2018-01-29 03:00:00
2018-05-26 22:00   | -0.1 | 13.5 | 13.3  | NA   | Southampton - Onslow Road | NA | NA | NA    | NA  | 2018-05-26 22:00:00
2018-06-08 02:00   | -0.6 | 14.7 | 13.9  | NA   | Southampton - Onslow Road | NA | NA | NA    | NA  | 2018-06-08 02:00:00
2018-06-10 21:00   | -0.2 | 29.5 | 29.3  | NA   | Southampton - Onslow Road | NA | NA | NA    | NA  | 2018-06-10 21:00:00
2018-06-11 19:00   | -0.1 | 13.8 | 13.5  | NA   | Southampton - Onslow Road | NA | NA | NA    | NA  | 2018-06-11 19:00:00
2018-06-14 03:00   | -0.1 | 7.6  | 7.5   | NA   | Southampton - Onslow Road | NA | NA | NA    | NA  | 2018-06-14 03:00:00
2018-06-27 19:00   | -0.1 | 19.0 | 18.8  | NA   | Southampton - Onslow Road | NA | NA | NA    | NA  | 2018-06-27 19:00:00
2018-06-27 21:00   | -0.2 | 26.5 | 26.2  | NA   | Southampton - Onslow Road | NA | NA | NA    | NA  | 2018-06-27 21:00:00
2018-07-03 22:00   | -0.1 | 20.4 | 20.3  | NA   | Southampton - Onslow Road | NA | NA | NA    | NA  | 2018-07-03 22:00:00
2018-07-21 02:00   | -0.3 | 14.6 | 14.2  | NA   | Southampton - Onslow Road | NA | NA | NA    | NA  | 2018-07-21 02:00:00
t <- table(dt[nox < 0]$site)
kableExtra::kable(t, caption = "Negative nox values (count by site)") %>% kable_styling()
Table 4.3: Negative nox values (count by site)
Var1                        | Freq
Southampton - Onslow Road   | 113
Southampton - Victoria Road | 37
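The same count can be put in context by expressing it as a share of each site's valid observations. This is a sketch only, using a made-up `toy` table with hypothetical site names; it is not the calculation used in the report.

```r
library(data.table)

# Toy data standing in for the report's `dt` (all values are made up)
toy <- data.table(
  site = c("A", "A", "B", "B", "B"),
  nox  = c(-0.1, 2.0, -0.3, -0.2, NA)
)

# Negative readings and valid (non-missing) readings per site
neg <- toy[, .(n_negative = sum(nox < 0, na.rm = TRUE),
               n_obs      = sum(!is.na(nox))),
           keyby = site]

# Share of valid readings that are negative, in percent
neg[, pct_negative := 100 * n_negative / n_obs]
print(neg)
```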
p <- ggplot2::ggplot(dt, aes(x = obsDateTime, y = site, fill = nox)) + geom_tile() + 
    scale_fill_continuous(low = "green", high = "red") + labs(x = "Time")
p

Figure 4.1: nox data availability

Figure 4.2 shows hourly values for all sites.

p <- ggplot2::ggplot(dt, aes(x = obsDateTime, y = nox, colour = site)) + geom_line()

p <- p + theme(legend.position = "bottom")

plotly::ggplotly(p)  # interactive

Figure 4.2: Nitric Oxide levels, Southampton (hourly)

# Add the calendar date so that hourly values can be aggregated by day
dt[, obsDate := lubridate::date(obsDateTime)]
plotDT <- dt[, .(mean_nox = mean(nox, na.rm = TRUE)), keyby = .(obsDate, site)]

p <- ggplot2::ggplot(plotDT, aes(x = obsDate, y = mean_nox, colour = site)) + 
    geom_line()

p <- p + theme(legend.position = "bottom")

plotly::ggplotly(p)  # interactive

Figure 4.3: Nitric Oxide levels, Southampton (daily mean - use mouse to hover over data)

Figure 4.3 shows daily values for all sites.

As expected, the daily means vary less, and show fewer extremes, than the hourly data.
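The smoothing effect of daily averaging can be illustrated with synthetic data. The sketch below generates made-up hourly readings (a log-normal distribution is assumed purely for illustration) and compares the spread of the hourly series with that of its daily means; none of the values come from the Southampton data.

```r
library(data.table)

set.seed(42)  # reproducible synthetic data

# 30 days of made-up hourly readings (log-normal, so occasional high spikes)
toy <- data.table(
  obsDateTime = seq(as.POSIXct("2018-01-01 00:00", tz = "UTC"),
                    by = "hour", length.out = 24 * 30)
)
toy[, value := rlnorm(.N, meanlog = 3, sdlog = 0.8)]
toy[, obsDate := as.Date(obsDateTime)]

# Daily means, as in the daily plots
daily <- toy[, .(mean_value = mean(value)), keyby = obsDate]

# Compare spread and extremes of the two series
spread <- data.table(
  series = c("hourly", "daily mean"),
  sd     = c(sd(toy$value), sd(daily$mean_value)),
  max    = c(max(toy$value), max(daily$mean_value))
)
print(spread)
```

Because each daily value averages 24 hourly readings, a single-hour spike is diluted, which is why short events can disappear from the daily plots.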

5 PM 10

PM10 is measured at two sites, so it has wider coverage than PM2.5, which is measured at only one (see Table 3.1).

t <- dt[, .(mean = mean(pm10, na.rm = TRUE), sd = sd(pm10, na.rm = TRUE), min = min(pm10, 
    na.rm = TRUE), max = max(pm10, na.rm = TRUE)), keyby = .(site)]
kableExtra::kable(t, caption = "Summary of pm10 data") %>% kable_styling()
Table 5.1: Summary of pm10 data
site mean sd min max
Southampton - A33 Roadside AURN | 16.89069 | 10.63913 | 0.0  | 106.4
Southampton - Bitterne          | NaN      | NA       | Inf  | -Inf
Southampton - Onslow Road       | NaN      | NA       | Inf  | -Inf
Southampton - Redbridge         | NaN      | NA       | Inf  | -Inf
Southampton - Victoria Road     | NaN      | NA       | Inf  | -Inf
Southampton Background AURN     | 19.30788 | 11.20350 | -1.5 | 252.5

Table 5.1 shows a negative minimum at Southampton Background AURN; 7 pm10 observations are negative. These are shown in Table 5.2.

# All rows with a negative pm10 value
t <- dt[pm10 < 0]
kableExtra::kable(t, caption = "Negative PM10 values") %>% kable_styling()
Table 5.2: Negative PM10 values
MeasurementDateGMT | nox | nox2 | noxes | pm10 | site                        | co | oz   | pm2_5 | so2 | obsDateTime         | obsDate
2018-01-02 12:00   | 9.3 | 26.0 | 40.3  | -0.4 | Southampton Background AURN | NA | 57.4 | 2.9   | 2.0 | 2018-01-02 12:00:00 | 2018-01-02
2018-09-12 00:00   | 1.7 | 10.0 | 12.6  | -1.4 | Southampton Background AURN | NA | 25.0 | 6.1   | 0.9 | 2018-09-12 00:00:00 | 2018-09-12
2018-12-23 06:00   | 2.2 | 17.4 | 20.8  | -0.8 | Southampton Background AURN | NA | 54.1 | 2.3   | 3.5 | 2018-12-23 06:00:00 | 2018-12-23
2018-12-30 13:00   | 5.3 | 18.9 | 26.9  | -0.2 | Southampton Background AURN | NA | 43.9 | 5.6   | 1.3 | 2018-12-30 13:00:00 | 2018-12-30
2018-12-30 14:00   | 3.8 | 17.5 | 23.4  | -1.5 | Southampton Background AURN | NA | 45.6 | 4.9   | 1.1 | 2018-12-30 14:00:00 | 2018-12-30
2018-12-30 16:00   | 7.2 | 28.9 | 39.9  | -0.1 | Southampton Background AURN | NA | 37.8 | 6.0   | 1.3 | 2018-12-30 16:00:00 | 2018-12-30
2018-12-30 19:00   | 2.6 | 16.3 | 20.3  | -1.0 | Southampton Background AURN | NA | 52.9 | 6.0   | 1.0 | 2018-12-30 19:00:00 | 2018-12-30
p <- ggplot2::ggplot(dt, aes(x = obsDateTime, y = site, fill = pm10)) + geom_tile() + 
    scale_fill_continuous(low = "green", high = "red") + labs(x = "Time")
p

Figure 5.1 shows hourly values for all sites.

p <- ggplot2::ggplot(dt, aes(x = obsDateTime, y = pm10, colour = site)) + geom_line()

p <- p + theme(legend.position = "bottom")

plotly::ggplotly(p)  # interactive

Figure 5.1: PM10 levels, Southampton (hourly - use mouse to hover over data)

# Daily mean pm10 per site (obsDate is the calendar date of each observation)
dt[, obsDate := lubridate::date(obsDateTime)]
plotDT <- dt[, .(mean_pm10 = mean(pm10, na.rm = TRUE)), keyby = .(obsDate, site)]

p <- ggplot2::ggplot(plotDT, aes(x = obsDate, y = mean_pm10, colour = site)) + 
    geom_line()

p <- p + theme(legend.position = "bottom") + geom_hline(yintercept = dailyPm10Threshold_WHO, 
    colour = "red")

plotly::ggplotly(p)  # interactive

Figure 5.2: PM10 levels, Southampton (daily mean, WHO daily threshold shown in red - use mouse to hover over data)

Figure 5.2 shows daily values for all sites and indicates those that cross the WHO daily threshold (red line).
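Days above the threshold can also be listed and counted rather than read off the plot. The sketch below assumes a threshold of 50 ug/m3, which is the WHO 2005 guideline for the 24-hour mean of PM10; the value the report assigns to `dailyPm10Threshold_WHO` is not shown, so this figure is an assumption, and the `daily` table is made up.

```r
library(data.table)

# Assumption: WHO 2005 24-hour guideline for PM10; the report's own value is not shown
threshold <- 50

# Toy daily means standing in for the report's `plotDT` (values are made up)
daily <- data.table(
  obsDate   = as.Date("2018-01-01") + 0:4,
  site      = "Toy site",
  mean_pm10 = c(20, 55, 49.9, 61, NA)
)

# Days whose mean exceeds the threshold (rows with a missing mean are dropped)
exceed <- daily[mean_pm10 > threshold]
print(exceed)

# Number of exceedance days per site
n_exceed <- daily[, .(days_over = sum(mean_pm10 > threshold, na.rm = TRUE)),
                  keyby = site]
print(n_exceed)
```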

As expected, the daily means vary less, and show fewer extremes, than the hourly data.

6 PM 2.5

t <- dt[, .(mean = mean(pm2_5, na.rm = TRUE), sd = sd(pm2_5, na.rm = TRUE), 
    min = min(pm2_5, na.rm = TRUE), max = max(pm2_5, na.rm = TRUE)), keyby = .(site)]
kableExtra::kable(t, caption = "Summary of pm2_5 data") %>% kable_styling()
Table 6.1: Summary of pm2_5 data
site                            | mean     | sd       | min | max
Southampton - A33 Roadside AURN | NaN      | NA       | Inf | -Inf
Southampton - Bitterne          | NaN      | NA       | Inf | -Inf
Southampton - Onslow Road       | NaN      | NA       | Inf | -Inf
Southampton - Redbridge         | NaN      | NA       | Inf | -Inf
Southampton - Victoria Road     | NaN      | NA       | Inf | -Inf
Southampton Background AURN     | 12.97722 | 9.410799 | -4  | 239.1

Table 6.1 shows a negative minimum at Southampton Background AURN; 12 pm2_5 observations are negative. These are shown in Table 6.2.

# All rows with a negative pm2_5 value
t <- dt[pm2_5 < 0]
kableExtra::kable(t, caption = "Negative pm2_5 values") %>% kable_styling()
Table 6.2: Negative pm2_5 values
MeasurementDateGMT | nox  | nox2 | noxes | pm10 | site                        | co | oz   | pm2_5 | so2 | obsDateTime         | obsDate
2018-01-01 13:00   | 2.4  | 13.0 | 16.7  | 5.6  | Southampton Background AURN | NA | 66.2 | -3.0  | 1.8 | 2018-01-01 13:00:00 | 2018-01-01
2018-01-01 14:00   | 1.2  | 8.4  | 10.2  | 17.4 | Southampton Background AURN | NA | 71.7 | -2.9  | 1.7 | 2018-01-01 14:00:00 | 2018-01-01
2018-01-05 15:00   | 21.2 | 60.4 | 93.0  | 18.4 | Southampton Background AURN | NA | 25.0 | -0.9  | 1.4 | 2018-01-05 15:00:00 | 2018-01-05
2018-01-15 07:00   | 5.2  | 16.0 | 24.0  | 5.4  | Southampton Background AURN | NA | 68.3 | -4.0  | 1.3 | 2018-01-15 07:00:00 | 2018-01-15
2018-01-18 14:00   | 12.9 | 43.2 | 63.0  | 5.4  | Southampton Background AURN | NA | 43.9 | -0.8  | 1.1 | 2018-01-18 14:00:00 | 2018-01-18
2018-01-18 15:00   | 14.3 | 47.5 | 69.4  | 7.7  | Southampton Background AURN | NA | 37.6 | -0.9  | 0.9 | 2018-01-18 15:00:00 | 2018-01-18
2018-01-28 07:00   | 2.1  | 24.2 | 27.4  | 5.5  | Southampton Background AURN | NA | 54.4 | -1.8  | 0.8 | 2018-01-28 07:00:00 | 2018-01-28
2018-08-24 04:00   | 3.0  | 10.9 | 15.5  | 11.1 | Southampton Background AURN | NA | 26.4 | -1.4  | 0.8 | 2018-08-24 04:00:00 | 2018-08-24
2018-08-25 05:00   | 6.2  | 16.5 | 26.1  | 7.1  | Southampton Background AURN | NA | 20.2 | -2.0  | 0.5 | 2018-08-25 05:00:00 | 2018-08-25
2018-09-07 04:00   | 2.3  | 13.3 | 16.9  | 13.2 | Southampton Background AURN | NA | 23.0 | -1.9  | 0.4 | 2018-09-07 04:00:00 | 2018-09-07
2018-10-27 07:00   | 4.8  | 16.4 | 23.7  | 8.6  | Southampton Background AURN | NA | 35.9 | -0.9  | 0.8 | 2018-10-27 07:00:00 | 2018-10-27
2018-12-07 02:00   | 0.9  | 6.1  | 7.5   | 7.0  | Southampton Background AURN | NA | 74.0 | -1.7  | 0.2 | 2018-12-07 02:00:00 | 2018-12-07
p <- ggplot2::ggplot(dt, aes(x = obsDateTime, y = site, fill = pm2_5)) + geom_tile() + 
    scale_fill_continuous(low = "green", high = "red") + labs(x = "Time")
p

Figure 6.1 shows hourly values for all sites.

p <- ggplot2::ggplot(dt, aes(x = obsDateTime, y = pm2_5, colour = site)) + geom_line()

p <- p + theme(legend.position = "bottom")

plotly::ggplotly(p)  # interactive

Figure 6.1: PM2_5 levels, Southampton (hourly - use mouse to hover over data)

# Daily mean pm2_5 per site (obsDate is the calendar date of each observation)
dt[, obsDate := lubridate::date(obsDateTime)]
plotDT <- dt[, .(mean_pm2_5 = mean(pm2_5, na.rm = TRUE)), keyby = .(obsDate, site)]

p <- ggplot2::ggplot(plotDT, aes(x = obsDate, y = mean_pm2_5, colour = site)) + 
    geom_line()

p <- p + theme(legend.position = "bottom") + geom_hline(yintercept = dailyPm2.5Threshold_WHO, 
    colour = "red")


plotly::ggplotly(p)  # interactive

Figure 6.2: PM2_5 levels, Southampton (daily mean, WHO daily threshold shown in red - use mouse to hover over data)

Figure 6.2 shows daily values for all sites and indicates those that cross the WHO daily threshold (red line).

As expected, the daily means vary less, and show fewer extremes, than the hourly data.

7 Observations

Something happened at 20:00 on 27th October 2019. All the hourly plots show a spike at that time, although it is masked in the daily plots. Could it have been a cruise ship leaving?

Something else happened at 21:00 on 2nd December 2019. Was this another ship?
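One way to check whether spikes like these coincide across sites is to extract the timestamp of each site's peak reading. The sketch below does this on a made-up `toy` table; the site names and values are hypothetical, and the timestamps are chosen only to mirror the first event described above.

```r
library(data.table)

# Toy hourly data for two hypothetical sites (values are made up)
toy <- data.table(
  obsDateTime = rep(as.POSIXct("2019-10-27 18:00", tz = "UTC") + 3600 * 0:3,
                    times = 2),
  site        = rep(c("Site A", "Site B"), each = 4),
  nox         = c(10, 12, 95, 11, 8, 9, 80, 10)
)

# For each site, keep the row holding its highest nox reading
peaks <- toy[!is.na(nox), .SD[which.max(nox)], by = site]
print(peaks)
```

If the peak timestamps agree across sites, a common external cause becomes more plausible than a fault in a single sensor, although this does not identify the cause.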

8 Runtime

Report generated using knitr in RStudio with R version 3.5.2 (2018-12-20) running on x86_64-apple-darwin15.6.0 (Darwin Kernel Version 17.7.0: Fri Oct 4 23:08:59 PDT 2019; root:xnu-4570.71.57~1/RELEASE_X86_64).

t <- proc.time() - startTime

elapsed <- t[[3]]

Analysis completed in 52.194 seconds (0.87 minutes).

R packages used:

  • data.table - (Dowle et al. 2015)
  • ggplot2 - (Wickham 2009)
  • here - (???)
  • kableExtra - (Zhu 2018)
  • lubridate - (Grolemund and Wickham 2011)
  • plotly - (Sievert et al. 2016)
  • skimr - (Arino de la Rubia et al. 2017)

References

Arino de la Rubia, Eduardo, Hao Zhu, Shannon Ellis, Elin Waring, and Michael Quinn. 2017. Skimr: Skimr. https://github.com/ropenscilabs/skimr.

Dowle, M, A Srinivasan, T Short, S Lianoglou with contributions from R Saporta, and E Antonyan. 2015. Data.table: Extension of Data.frame. https://CRAN.R-project.org/package=data.table.

Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25. http://www.jstatsoft.org/v40/i03/.

Sievert, Carson, Chris Parmer, Toby Hocking, Scott Chamberlain, Karthik Ram, Marianne Corvellec, and Pedro Despouy. 2016. Plotly: Create Interactive Web Graphics via ’Plotly.js’. https://CRAN.R-project.org/package=plotly.

Wickham, Hadley. 2009. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. http://ggplot2.org.

Zhu, Hao. 2018. KableExtra: Construct Complex Table with ’Kable’ and Pipe Syntax. https://CRAN.R-project.org/package=kableExtra.